26 research outputs found
Learning strategies for improving neural networks for image segmentation under class imbalance
This thesis aims to improve convolutional neural networks (CNNs) for image segmentation under class imbalance, i.e. the problem of training on datasets whose class distributions are unequal. We focus in particular on medical image segmentation because of its inherently imbalanced nature and its clinical importance.
Based on our observations of model behaviour, we argue that CNNs generalize poorly on imbalanced segmentation tasks for two counterintuitive reasons. First, CNNs are prone to overfit the under-represented foreground classes: precisely because the regions of interest (ROIs) are so rare, the network memorizes them from the training data. Second, CNNs can underfit the heterogeneous background class, as it is difficult to learn from samples with such diverse and complex characteristics. These behaviours are not tied to any specific loss function.
To address these limitations, we first propose novel asymmetric variants of popular loss functions and regularization techniques, explicitly designed to increase the variance of foreground samples and thereby counter overfitting under class imbalance. Second, we propose context label learning (CoLab) to tackle background underfitting by automatically decomposing the background class into several subclasses; an auxiliary task generator is optimized to produce context labels such that the main network achieves good ROI segmentation performance. Third, we propose a meta-learning-based automatic data augmentation framework that balances foreground and background samples to alleviate class imbalance: we learn class-specific training-time data augmentation (TRA) and jointly optimize TRA with test-time data augmentation (TEA), effectively aligning the training and test data distributions for better generalization. Finally, we explore how to estimate model performance under domain shift when models are trained on imbalanced datasets. We propose class-specific variants of existing confidence-based model evaluation methods that adapt separate parameters per class, enabling class-wise calibration to reduce model bias towards the minority classes.
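The asymmetric-loss idea can be sketched as follows. This is a minimal illustration, not the thesis's exact formulation: it assumes a focal-style binary loss in which the down-weighting term is applied only to the abundant background class, so gradients from the rare foreground class are never suppressed.

```python
import numpy as np

def asymmetric_focal_loss(p_fg, y, gamma=2.0):
    """Asymmetric focal-style loss for binary segmentation (illustrative).

    Standard focal loss down-weights easy examples of every class via a
    (p)^gamma modulating factor. This asymmetric variant keeps plain
    cross-entropy for the rare foreground class and applies the focusing
    term only to the background class.

    p_fg : predicted foreground probability per voxel
    y    : ground-truth label per voxel (1 = foreground, 0 = background)
    """
    p_fg = np.clip(p_fg, 1e-7, 1 - 1e-7)
    fg = y * -np.log(p_fg)                               # foreground: plain CE
    bg = (1 - y) * (p_fg ** gamma) * -np.log(1 - p_fg)   # background: focused CE
    return float(np.mean(fg + bg))
```

Note that foreground voxels contribute their full cross-entropy regardless of `gamma`, while confidently-background voxels (small `p_fg`) are down-weighted.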
Robustness Stress Testing in Medical Image Classification
Deep neural networks have shown impressive performance for image-based
disease detection. Performance is commonly evaluated through clinical
validation on independent test sets to demonstrate clinically acceptable
accuracy. Reporting good performance metrics on test sets, however, is not
always a sufficient indication of the generalizability and robustness of an
algorithm. In particular, when the test data is drawn from the same
distribution as the training data, the iid test set performance can be an
unreliable estimate of the accuracy on new data. In this paper, we employ
stress testing to assess model robustness and subgroup performance disparities
in disease detection models. We design progressive stress testing using five
different bidirectional and unidirectional image perturbations with six
different severity levels. As a use case, we apply stress tests to measure the
robustness of disease detection models for chest X-ray and skin lesion images,
and demonstrate the importance of studying class and domain-specific model
behaviour. Our experiments indicate that some models may yield more robust and
equitable performance than others. We also find that pretraining
characteristics play an important role in downstream robustness. We conclude
that progressive stress testing is a viable and important tool and should
become standard practice in the clinical validation of image-based disease
detection models.
Comment: 11 pages
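A progressive stress test of this kind can be sketched as follows. The gamma (brightness) perturbation and the particular severity schedule are illustrative assumptions; the paper uses five different perturbation types, each at six severity levels.

```python
import numpy as np

def gamma_perturb(img, severity, direction=+1):
    """Bidirectional gamma perturbation on an image in [0, 1].

    Severity 0 is the identity; direction +1 darkens mid-tones
    (gamma > 1), -1 brightens them (gamma < 1). The exponent
    schedule is an illustrative choice.
    """
    gammas = [1.0, 1.2, 1.5, 1.8, 2.2, 2.6, 3.0]  # identity + 6 levels
    g = gammas[severity] if direction > 0 else 1.0 / gammas[severity]
    return np.clip(img, 0, 1) ** g

def stress_curve(model, images, labels, max_severity=6):
    """Accuracy at each severity level, in both perturbation directions."""
    curve = {}
    for d in (+1, -1):
        curve[d] = [np.mean([model(gamma_perturb(x, s, d)) == y
                             for x, y in zip(images, labels)])
                    for s in range(max_severity + 1)]
    return curve
```

Plotting each direction's accuracy against severity yields the degradation curves that the stress test compares across models and subgroups.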
Post-Deployment Adaptation with Access to Source Data via Federated Learning and Source-Target Remote Gradient Alignment
Deployment of Deep Neural Networks in medical imaging is hindered by
distribution shift between training data and data processed after deployment,
causing performance degradation. Post-Deployment Adaptation (PDA) addresses
this by tailoring a pre-trained, deployed model to the target data distribution
using limited labelled or entirely unlabelled target data, while assuming no
access to source training data as they cannot be deployed with the model due to
privacy concerns and their large size. This makes reliable adaptation
challenging due to limited learning signal. This paper challenges this
assumption and introduces FedPDA, a novel adaptation framework that brings the
utility of learning from remote data from Federated Learning into PDA. FedPDA
enables a deployed model to obtain information from source data via remote
gradient exchange, while aiming to optimize the model specifically for the
target domain. Tailored for FedPDA, we introduce a novel optimization method
StarAlign (Source-Target Remote Gradient Alignment) that aligns gradients
between source-target domain pairs by maximizing their inner product, to
facilitate learning a target-specific model. We demonstrate the method's
effectiveness using multi-center databases for the tasks of cancer metastases
detection and skin lesion classification, where our method compares favourably
to previous work. Code is available at: https://github.com/FelixWag/StarAlign
Comment: This version was accepted for the Machine Learning in Medical Imaging
(MLMI 2023) workshop at MICCAI 202
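The gradient-alignment idea can be sketched as follows. This is a simplified stand-in for StarAlign, under the assumption that source gradients arriving via federated exchange are combined with the target gradient in proportion to their inner product; the weighting rule and `lam` are illustrative, not the paper's exact optimization.

```python
import numpy as np

def aligned_update(g_target, g_sources, lam=0.5):
    """Combine the target-domain gradient with remote source gradients,
    rewarding source-target alignment (illustrative sketch).

    Each source gradient contributes in proportion to its inner product
    with the target gradient, so source signal that conflicts with the
    target objective is suppressed and agreeing signal is amplified.
    """
    update = g_target.copy()
    for g_s in g_sources:
        w = max(0.0, float(np.dot(g_s, g_target)))  # keep only the aligned part
        update += lam * w * g_s
    return update
```

In a federated setting only the gradient vectors cross the network, so the raw source data never leaves its site.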
Joint Optimization of Class-Specific Training- and Test-Time Data Augmentation in Segmentation
This paper presents an effective and general data augmentation framework for
medical image segmentation. We adopt a computationally efficient and
data-efficient gradient-based meta-learning scheme to explicitly align the
distribution of training and validation data which is used as a proxy for
unseen test data. We improve the current data augmentation strategies with two
core designs. First, we learn class-specific training-time data augmentation
(TRA) effectively increasing the heterogeneity within the training subsets and
tackling the class imbalance common in segmentation. Second, we jointly
optimize TRA and test-time data augmentation (TEA), which are closely connected
as both aim to align the training and test data distribution but were so far
considered separately in previous works. We demonstrate the effectiveness of
our method on four medical image segmentation tasks across different scenarios
with two state-of-the-art segmentation models, DeepMedic and nnU-Net. Extensive
experimentation shows that the proposed data augmentation framework can
significantly and consistently improve the segmentation performance when
compared to existing solutions. Code is publicly available.
Comment: Accepted by IEEE Transactions on Medical Imaging
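The class-specific TRA learning can be sketched as follows. The intensity-shift transform, the per-class magnitude parameterization, and the finite-difference meta-update are illustrative assumptions standing in for the gradient-based meta-learning scheme described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_tra(img, cls, magnitudes):
    """Class-specific training-time augmentation: an intensity shift whose
    range is a learnable per-class parameter (illustrative transform)."""
    shift = rng.uniform(-magnitudes[cls], magnitudes[cls])
    return np.clip(img + shift, 0.0, 1.0)

def meta_step(magnitudes, val_loss_fn, lr=0.1, eps=1e-2):
    """One meta-update of the per-class augmentation magnitudes: nudge each
    magnitude in the direction that lowers the validation loss, which serves
    as a proxy for unseen test data. Central finite differences stand in for
    the gradient-based meta-learning used in the paper."""
    new = magnitudes.copy()
    for c in range(len(magnitudes)):
        up, down = magnitudes.copy(), magnitudes.copy()
        up[c] += eps
        down[c] -= eps
        grad = (val_loss_fn(up) - val_loss_fn(down)) / (2 * eps)
        new[c] = max(0.0, magnitudes[c] - lr * grad)  # magnitudes stay non-negative
    return new
```

Because each class carries its own magnitude, the rare foreground classes can receive stronger augmentation than the background, which is how the framework counters class imbalance.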
Multi-source Education Knowledge Graph Construction and Fusion for College Curricula
The field of education has undergone a significant transformation due to the
rapid advancements in Artificial Intelligence (AI). Among the various AI
technologies, Knowledge Graphs (KGs) using Natural Language Processing (NLP)
have emerged as powerful visualization tools for integrating multifaceted
information. In the context of university education, the availability of
numerous specialized courses and complicated learning resources often leads to
inferior learning outcomes for students. In this paper, we propose an automated
framework for knowledge extraction, visual KG construction, and graph fusion,
tailored for the major of Electronic Information. Furthermore, we perform data
analysis to investigate the correlation degree and relationship between
courses, rank hot knowledge concepts, and explore the intersection of courses.
Our objective is to enhance the learning efficiency of students and to explore
new educational paradigms enabled by AI. The proposed framework is expected to
enable students to better understand and appreciate the intricacies of their
field of study by providing them with a comprehensive understanding of the
relationships between the various concepts and courses.
Comment: Accepted by ICALT202
Causality-inspired Single-source Domain Generalization for Medical Image Segmentation
Deep learning models usually suffer from domain shift issues, where models
trained on one source domain do not generalize well to other unseen domains. In
this work, we investigate the single-source domain generalization problem:
training a deep network that is robust to unseen domains, under the condition
that training data is only available from one source domain, which is common in
medical imaging applications. We tackle this problem in the context of
cross-domain medical image segmentation. Under this scenario, domain shifts are
mainly caused by different acquisition processes. We propose a simple
causality-inspired data augmentation approach to expose a segmentation model to
synthesized domain-shifted training examples. Specifically, 1) to make the deep
model robust to discrepancies in image intensities and textures, we employ a
family of randomly-weighted shallow networks. They augment training images
using diverse appearance transformations. 2) Further we show that spurious
correlations among objects in an image are detrimental to domain robustness.
These correlations might be taken by the network as domain-specific clues for
making predictions, and they may break on unseen domains. We remove these
spurious correlations via causal intervention. This is achieved by resampling
the appearances of potentially correlated objects independently. The proposed
approach is validated on three cross-domain segmentation tasks: cross-modality
(CT-MRI) abdominal image segmentation, cross-sequence (bSSFP-LGE) cardiac MRI
segmentation, and cross-center prostate MRI segmentation. The proposed approach
yields consistent performance gains compared with competitive methods when
tested on unseen domains.
Comment: Preprint
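The two components above can be sketched as follows. The 1x1-conv-plus-nonlinearity transform is a simplified stand-in for the paper's randomly-weighted shallow networks, and the per-object resampling illustrates the causal-intervention step; the parameter distributions are illustrative assumptions.

```python
import numpy as np

def random_shallow_transform(vals, rng):
    """Appearance transform from a randomly-weighted 'shallow network':
    here a random 1x1 convolution (an affine map in intensity) followed
    by a nonlinearity, a simplified stand-in for the paper's shallow
    convolutional networks."""
    a = rng.normal(1.0, 0.5)   # random weight
    b = rng.normal(0.0, 0.1)   # random bias
    return np.tanh(a * vals + b)

def causal_intervention(img, masks, rng):
    """Resample the appearance of each (potentially correlated) object
    independently, so the segmentation model cannot exploit spurious
    joint-appearance correlations as domain-specific clues."""
    out = img.copy()
    for m in masks:  # m: boolean mask selecting one object
        out[m] = random_shallow_transform(img[m], rng)
    return out
```

Training on many such independently resampled views exposes the model to synthesized domain shifts while leaving the segmentation labels unchanged.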
Hand Pose-based Task Learning from Visual Observations with Semantic Skill Extraction
Learning from Demonstrations is a promising technique for transferring task knowledge from a user to a robot. We propose a framework for task programming by observing the human hand pose and object locations solely with a depth camera. By extracting skills from the demonstrations, we are able to represent what the robot has learned, generalize to unseen object locations, and optimize the robotic execution instead of replaying a non-optimal behaviour. A two-stage segmentation algorithm that employs skill template matching via Hidden Markov Models has been developed to extract motion primitives from the demonstration and give them semantic meaning. In this way, the transfer of task knowledge is improved from a simple replay of the demonstration to a semantically annotated, optimized and generalized execution. We evaluate the extraction of a set of skills in simulation and show that the task execution can be optimized by such means.
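The skill-matching stage can be sketched as follows. A nearest-template score over length-normalized trajectories stands in for the HMM likelihood used in the paper's skill template matching; the template names and the mean-squared-error score are illustrative assumptions.

```python
import numpy as np

def resample(seq, n=20):
    """Linearly resample a 1-D trajectory to a fixed length so that
    segments and templates of different durations are comparable."""
    x = np.linspace(0, 1, len(seq))
    return np.interp(np.linspace(0, 1, n), x, np.asarray(seq, float))

def match_skill(segment, templates):
    """Assign a demonstrated motion segment to the closest skill template.

    templates : dict mapping a skill name to a reference trajectory.
    Returns the name of the template with the lowest mean squared
    distance after length normalization (a simplified stand-in for
    scoring each segment under per-skill Hidden Markov Models).
    """
    seg = resample(segment)
    scores = {name: float(np.mean((seg - resample(t)) ** 2))
              for name, t in templates.items()}
    return min(scores, key=scores.get)
```

Each matched segment then carries its template's semantic label, which is what allows the execution to be re-planned rather than replayed verbatim.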